AAAI.2024 - Doctoral Consortium

Total: 30

#1 Discovering Heterogeneous Causal Effects in Relational Data [PDF] [Copy] [Kimi]

Author: Shishir Adhikari

Causal inference in relational data should account for the non-IID nature of the data and the interference phenomenon, which occurs when a unit's outcome is influenced by the treatments or outcomes of others. Existing solutions to causal inference under interference consider either homogeneous influence from peers or specific heterogeneous influence contexts (e.g., local neighborhood structure). This thesis investigates causal reasoning in relational data and the automated discovery of heterogeneous causal effects under arbitrary heterogeneous peer influence contexts and effect modification.

#2 Knowledge Distillation from Single-Task Teachers to Multi-Task Student for End-to-End Autonomous Driving [PDF] [Copy] [Kimi]

Author: Pedram Agand

In the domain of end-to-end autonomous driving, conventional sensor fusion techniques exhibit inadequacies, particularly in challenging scenarios with numerous dynamic agents. Imitation learning caps performance at the level of the expert and struggles with out-of-distribution scenarios. To overcome these limitations, we propose a transformer-based algorithm designed to fuse diverse representations from RGB-D cameras through knowledge distillation. This approach leverages insights from single-task teachers to enhance the learning capabilities of a multi-task student, particularly in a Reinforcement Learning (RL) setting. Our model consists of two primary modules: the perception module, responsible for encoding observation data acquired from RGB-D cameras and performing tasks such as semantic segmentation, semantic depth cloud (SDC) mapping, ego vehicle speed estimation, and traffic light state recognition. Subsequently, the control module decodes these features, incorporating additional data, including a rough simulator for static and dynamic environments, to anticipate waypoints within a latent feature space. Vehicular controls (e.g., steering, throttle, and brake) are obtained directly from measurement features and environmental states using the RL agent and are further refined by a PID algorithm that dynamically follows the waypoints. The model undergoes rigorous evaluation and comparative analysis on the CARLA simulator across various scenarios, encompassing normal to adversarial conditions. Our code is available at https://github.com/pagand/e2etransfuser/ to facilitate future studies.
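
The last stage of the control stack described above refines the RL agent's output with a PID controller that follows predicted waypoints. As a minimal sketch of that refinement step only, the snippet below implements a generic PID loop steering toward the heading of the next waypoint; the gains, the `Waypoint`-style arguments, and the function names are hypothetical and not taken from the paper or its repository.

```python
import math
from dataclasses import dataclass

@dataclass
class PID:
    """Generic PID controller; the gains are illustrative, not from the paper."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float, dt: float) -> float:
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

def steer_towards(x, y, yaw, wx, wy, pid: PID, dt=0.05):
    """Steering command that turns the ego vehicle toward waypoint (wx, wy)."""
    target_heading = math.atan2(wy - y, wx - x)
    # Wrap the heading error to [-pi, pi] so the controller takes the short way round.
    error = (target_heading - yaw + math.pi) % (2 * math.pi) - math.pi
    return max(-1.0, min(1.0, pid.step(error, dt)))

if __name__ == "__main__":
    pid = PID(kp=1.2, ki=0.05, kd=0.2)   # hypothetical gains
    print(steer_towards(0.0, 0.0, 0.0, 5.0, 2.0, pid))
```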

#3 Tiered Coalition Formation Game Stability and Simulation [PDF] [Copy] [Kimi]

Author: Nathan Arnold

Expanding on a 2017 paper by Siler that introduced tiered coalition formation games, I have introduced a variant game and examined the stabilizability of both the original game and its variant. My thesis will contain further theoretical stability findings and the results and interpretation of a simulation based upon real data from video game matchups.

#4 Semi-factual Explanations in AI [PDF] [Copy] [Kimi]

Author: Saugat Aryal

Most recent work on post-hoc, example-based eXplainable AI (XAI) methods revolves around employing counterfactual explanations to justify the predictions made by AI systems. Counterfactuals show what changes to the input features change the output decision. However, a lesser-known special case of the counterfactual is the semi-factual, which provides explanations about what changes to the input features do not change the output decision. Semi-factuals are potentially as useful as counterfactuals but have received little attention in the XAI literature. My doctoral research aims to establish a comprehensive framework for the use of semi-factuals in XAI by developing novel methods for their computation, supported by user tests.

#5 Domain Engineering to Represent Human Behavior Using Multi-Agent Planning and Inductive Methodologies [PDF] [Copy] [Kimi]

Author: Salena Torres Ashton

This research combines multi-agent planning, the psycholinguistics of question asking, procedural grounded theory, and hierarchical task networks to represent domains for automated planning.

#6 The Promise of Serverless Computing within Peer-to-Peer Architectures for Distributed ML Training [PDF] [Copy] [Kimi]

Author: Amine Barrak

My thesis focuses on the integration of serverless computing with Peer-to-Peer (P2P) architectures in distributed Machine Learning (ML). This research aims to harness the decentralized, resilient nature of P2P systems, combined with the scalability and automation of serverless platforms. We explore using databases not just for communication but also for in-database model updates and gradient averaging, addressing the challenges of statelessness in serverless environments.
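
To make the idea of using a database for gradient averaging concrete, here is a minimal sketch in which stateless workers push gradients to a shared table and an aggregation step averages them per round. The use of sqlite3, the table schema, and all names are assumptions for illustration; the abstract does not prescribe a specific store or aggregation protocol.

```python
import json
import sqlite3

# Stand-in for the shared database the abstract alludes to; sqlite3 is purely
# illustrative -- the thesis does not prescribe a specific storage backend.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gradients (peer_id TEXT, round INTEGER, grad TEXT)")

def push_gradient(peer_id: str, rnd: int, grad: list) -> None:
    """A serverless worker publishes its local gradient for the given round."""
    conn.execute("INSERT INTO gradients VALUES (?, ?, ?)",
                 (peer_id, rnd, json.dumps(grad)))
    conn.commit()

def averaged_gradient(rnd: int) -> list:
    """Aggregation over the shared table: average all gradients for a round."""
    rows = conn.execute("SELECT grad FROM gradients WHERE round = ?",
                        (rnd,)).fetchall()
    grads = [json.loads(g) for (g,) in rows]
    return [sum(vals) / len(grads) for vals in zip(*grads)]

push_gradient("peer-a", 0, [0.1, -0.2, 0.3])
push_gradient("peer-b", 0, [0.3, 0.0, 0.1])
print(averaged_gradient(0))   # -> [0.2, -0.1, 0.2]
```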

#7 Identifying, Mitigating, and Anticipating Bias in Algorithmic Decisions [PDF] [Copy] [Kimi]

Author: Joachim Baumann

Today's machine learning (ML) applications predominantly adhere to a standard paradigm: the decision maker designs the algorithm by optimizing a model for some objective function. While this has proven to be a powerful approach in many domains, it comes with inherent side effects: the power over the algorithmic outcomes lies solely in the hands of the algorithm designer, and alternative objectives, such as fairness, are often disregarded. This is particularly problematic if the algorithm is used to make consequential decisions that affect people's lives. My research focuses on developing principled methods to characterize and address the mismatch between these different objectives.

#8 Deep Reinforcement Learning for Communication Networks [PDF] [Copy] [Kimi]

Author: Raffaele Galliera

This research explores optimizing communication tasks with (Multi-Agent) Reinforcement Learning (RL/MARL) in Point-to-Point and Group Communication (GC) networks. The study initially applied RL for Congestion Control in networks with dynamic link properties, yielding competitive results. Then, it focused on the challenge of effective message dissemination in GC networks, by framing a novel game-theoretic formulation and designing methods to solve the task based on MARL and Graph Convolution. Future research will deepen the exploration of MARL in GC. This will contribute to both academic knowledge and practical advancements in the next generation of communication protocols.
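
Since the abstract mentions combining MARL with graph convolution for message dissemination, the sketch below shows one mean-aggregation graph-convolution step over node features, which is the kind of neighbourhood encoding such agents could use. The layer design, weights, and toy graph are assumptions for illustration, not the architecture used in the thesis.

```python
def graph_conv_layer(features, adjacency, weights):
    """One mean-aggregation graph-convolution step over node features.

    features : list of per-node feature vectors
    adjacency: adjacency list, adjacency[i] = neighbours of node i
    weights  : square weight matrix (list of lists), illustrative values
    """
    out = []
    for i, feat in enumerate(features):
        neigh = [features[j] for j in adjacency[i]] + [feat]  # include self
        mean = [sum(col) / len(neigh) for col in zip(*neigh)]
        # Linear transform followed by a ReLU nonlinearity.
        h = [sum(w * x for w, x in zip(row, mean)) for row in weights]
        out.append([max(0.0, v) for v in h])
    return out

features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adjacency = [[1], [0, 2], [1]]           # a 3-node path graph
weights = [[0.5, 0.1], [0.2, 0.4]]       # hypothetical learned weights
print(graph_conv_layer(features, adjacency, weights))
```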

#9 Towards Trustworthy Autonomous Systems via Conversations and Explanations [PDF] [Copy] [Kimi]

Author: Balint Gyevnar

Autonomous systems fulfil an increasingly important role in our societies. However, AI-powered systems have seen limited success over the years, as they are expected to tackle a range of social, legal, and technological challenges, and modern neural-network-based AI systems cannot yet provide guarantees for many of these challenges. Particularly important is that these systems are black-box decision makers, eroding human oversight, contestation, and agency. To address this concern, my thesis focuses on integrating social explainable AI with cognitive methods and natural language processing to shed light on the internal processes of autonomous systems in a way that is accessible to lay users. I propose CEMA, a causal explanation generation model for decision-making based on counterfactual simulations in multi-agent systems. I also plan to integrate CEMA with a broader natural language processing pipeline to support targeted and personalised explanations that address people's cognitive biases. I hope that my research will have a positive impact on the public acceptance of autonomous agents by building towards more trustworthy AI.

#10 Temporal Dependencies and Spatio-Temporal Patterns of Time Series Models [PDF] [Copy] [Kimi]

Author: Md. Khairul Islam

The widespread use of Artificial Intelligence (AI) has highlighted the importance of understanding AI model behavior. This understanding is crucial for practical decision-making, assessing model reliability, and ensuring trustworthiness. Interpreting time series forecasting models faces unique challenges compared to image and text data. These challenges arise from the temporal dependencies between time steps and the evolving importance of input features over time. My thesis focuses on addressing these challenges by aiming for more precise explanations of feature interactions, uncovering spatiotemporal patterns, and demonstrating the practical applicability of these interpretability techniques using real-world datasets and state-of-the-art deep learning models.
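
As one simple way to probe the evolving importance of input features over time, the sketch below uses an occlusion-style perturbation: each (time step, feature) entry is replaced by a baseline value and the change in the model's prediction is recorded. This is a generic interpretability probe offered for illustration, not the specific method developed in the thesis; the toy model and baseline are assumptions.

```python
def temporal_importance(model, x, baseline=0.0):
    """Occlusion-style saliency over a time-series input.

    model : callable mapping a [T][F] window to a scalar prediction
    x     : nested list, x[t][f] = value of feature f at time step t
    Returns a [T][F] grid of |prediction change| when each entry is occluded.
    """
    base_pred = model(x)
    scores = []
    for t in range(len(x)):
        row = []
        for f in range(len(x[t])):
            perturbed = [list(step) for step in x]   # copy the window
            perturbed[t][f] = baseline               # occlude one entry
            row.append(abs(model(perturbed) - base_pred))
        scores.append(row)
    return scores

# Toy "model": weighted sum that cares most about the last time step.
toy_model = lambda window: sum((t + 1) * sum(step) for t, step in enumerate(window))
print(temporal_importance(toy_model, [[1.0, 2.0], [0.5, 0.5], [3.0, 1.0]]))
```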

#11 Risk Management in Image Generative Models through Model Fingerprinting [PDF] [Copy] [Kimi]

Author: Changhoon Kim

My doctoral research delves into the realm of generative model fingerprinting, aiming to assign responsibility for the generated images. I introduce frameworks that modify generative models to incorporate each user's distinct digital fingerprint. This ensures that every piece of generated content carries a traceable identifier linked to its originator. The primary objective of my research is to achieve optimal attribution accuracy while ensuring minimal compromise on the model's performance. Additionally, I present strategies designed to enhance robustness against common adversarial manipulations, which malicious users might employ to obscure or remove these fingerprints.

#12 The Inter-batch Diversity of Samples in Experience Replay for Continual Learning [PDF] [Copy] [Kimi]

Author: Andrii Krutsylo

In a Continual Learning setting, models are trained on data with occasional distribution shifts, resulting in forgetting the information learned before each shift. Experience Replay (ER) addresses this challenge by retaining part of the old training samples and replaying them alongside current data, improving the model's understanding of the overall distribution in training batches. The crucial factor in ER performance is the diversity of samples within batches. My research investigates the impact of sample diversity across a sequence of batches, introducing a new metric and an associated approach to assess and leverage this diversity. This exploration opens up significant potential for future work, as various strategies can be devised to ensure inter-batch diversity. Achieving optimal results may involve striking a balance between this novel metric and other inherent properties of a batch or sequence.
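
The abstract does not specify the proposed metric, so the sketch below uses an illustrative proxy: the mean pairwise distance between samples of two consecutive batches, and a greedy replay-selection rule that favours stored samples far from the current batch. All function names and values are assumptions for illustration only.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def inter_batch_diversity(prev_batch, curr_batch):
    """Mean pairwise distance between samples of two consecutive batches.

    An illustrative proxy for 'inter-batch diversity'; the actual metric
    proposed in the thesis is not specified in the abstract.
    """
    pairs = [(a, b) for a in prev_batch for b in curr_batch]
    return sum(euclidean(a, b) for a, b in pairs) / len(pairs)

def pick_replay_batch(memory, curr_batch, k=2):
    """Greedily pick k stored samples that maximise diversity w.r.t. curr_batch."""
    ranked = sorted(memory, key=lambda s: -inter_batch_diversity([s], curr_batch))
    return ranked[:k]

memory = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
current = [[0.2, 0.1], [0.3, 0.0]]
print(pick_replay_batch(memory, current))   # farthest stored samples first
```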

#13 Making AI Policies Transparent to Humans through Demonstrations [PDF] [Copy] [Kimi]

Author: Michael S. Lee

Demonstrations are a powerful way of increasing the transparency of AI policies to humans. Though we can approximately model human learning from demonstrations as inverse reinforcement learning, we note that human learning can differ from algorithmic learning in key ways, e.g., humans are computationally limited and may sometimes struggle to understand all of the nuances of a demonstration. Unlike related work that provides humans with demonstrations that simply maximize information gain, I leverage concepts from the human education literature, such as the zone of proximal development and scaffolding, to show demonstrations that balance informativeness and difficulty of understanding to maximize human learning.
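
One way to picture the informativeness-versus-difficulty trade-off is a scoring rule that rewards informative demonstrations but penalises those outside a target difficulty band, by analogy with the zone of proximal development. The sketch below is only an illustration of that intuition under assumed scores and thresholds; it is not the selection criterion used in the thesis.

```python
def zpd_score(informativeness, difficulty, zpd_low=0.3, zpd_high=0.7, penalty=2.0):
    """Reward informative demos, penalise those outside the learner's
    zone of proximal development (ZPD). Thresholds are hypothetical."""
    if difficulty < zpd_low:
        gap = zpd_low - difficulty      # too easy -> little learning
    elif difficulty > zpd_high:
        gap = difficulty - zpd_high     # too hard -> overwhelms the learner
    else:
        gap = 0.0
    return informativeness - penalty * gap

def select_demonstrations(demos, k=2):
    """demos: list of (name, informativeness, difficulty) tuples."""
    ranked = sorted(demos, key=lambda d: -zpd_score(d[1], d[2]))
    return [name for name, _, _ in ranked[:k]]

demos = [("merge-left", 0.9, 0.95),   # very informative but too hard
         ("slow-down", 0.6, 0.50),    # moderately informative, right difficulty
         ("go-straight", 0.2, 0.10)]  # easy but uninformative
print(select_demonstrations(demos))
```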

#14 A Privacy Preserving Federated Learning (PPFL) Based Cognitive Digital Twin (CDT) Framework for Smart Cities [PDF] [Copy] [Kimi]

Author: Sukanya Mandal

A Smart City is one that makes better use of city data to make our communities better places to live. Typically, this has three components: sensing (data collection), analysis, and actuation. Privacy, particularly as it relates to citizens' data, is a cross-cutting theme. A Digital Twin (DT) is a virtual replica of a real-world physical entity. Cognitive Digital Twins (CDTs) are DTs enhanced with cognitive AI capabilities. Both DTs and CDTs have seen adoption in the manufacturing and industrial sectors; however, cities have been slow to adopt them because of privacy concerns. This work attempts to address these concerns by proposing a Privacy Preserving Federated Learning (PPFL) based Cognitive Digital Twin framework for Smart Cities.
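
Federated learning keeps raw citizen data on local nodes and shares only model parameters for aggregation. As a minimal sketch of the standard weighted federated-averaging (FedAvg) step such a framework could build on, the snippet below aggregates parameter vectors from a few hypothetical edge nodes; the node data and weights are invented for illustration.

```python
def federated_average(local_weights, client_sizes):
    """Weighted FedAvg aggregation of client model parameters.

    local_weights: list of flat parameter vectors, one per city sensor node
    client_sizes : number of local samples each node trained on
    Raw citizen data never leaves the node; only parameters are shared.
    """
    total = sum(client_sizes)
    dim = len(local_weights[0])
    return [sum(w[i] * n for w, n in zip(local_weights, client_sizes)) / total
            for i in range(dim)]

# Hypothetical updates from three edge nodes of a city digital twin.
weights = [[0.10, 0.40], [0.20, 0.35], [0.05, 0.50]]
sizes = [100, 300, 50]
print(federated_average(weights, sizes))
```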

#15 Thesis Summary: Operationalizing User-Inclusive Transparency in Artificial Intelligence Systems [PDF] [Copy] [Kimi]

Author: Deepa Muralidhar

Artificial intelligence system architects can increase user trust by designing systems that are inherently transparent. We propose the idea of representing an AI system as an amalgamation of the AI model (algorithms), data (input and output, including outcomes), and the user interface with visual interpretations (e.g., graphs, Venn diagrams). By designing human controls and feedback mechanisms for AI systems that allow users to exert control over them, we can integrate transparency into existing user interfaces. Our plan is to design prototypes of transparent user interfaces for AI systems using well-known usability principles. By conducting surveys, we will study their impact to see whether these principles help users work with the AI system confidently and whether they perceive the system to be adequately transparent.

#16 Learning Generalizable and Composable Abstractions for Transfer in Reinforcement Learning [PDF] [Copy] [Kimi]

Author: Rashmeet Kaur Nayyar

Reinforcement Learning (RL) in complex environments presents many challenges: agents require learning concise representations of both environments and behaviors for efficient reasoning and generalizing experiences to new, unseen situations. However, RL approaches can be sample-inefficient and difficult to scale, especially in long-horizon sparse reward settings. To address these issues, the goal of my doctoral research is to develop methods that automatically construct semantically meaningful state and temporal abstractions for efficient transfer and generalization. In my work, I develop hierarchical approaches for learning transferable, generalizable knowledge in the form of symbolically represented options, as well as for integrating search techniques with RL to solve new problems by efficiently composing the learned options. Empirical results show that the resulting approaches effectively learn and transfer knowledge, achieving superior sample efficiency compared to SOTA methods while also enhancing interpretability.

#17 A Hybrid AI Framework for Sensor-Based Personal Health Monitoring towards Precision Health [PDF] [Copy] [Kimi]

Author: Mbithe Nzomo

Non-communicable diseases are on the rise globally, resulting in accelerated efforts to develop personal health monitoring systems for early detection, prediction, and prevention of diseases. This is part of the vision of precision health, an emerging paradigm that focuses on preventing disease before it strikes by encouraging people to actively monitor and work towards improving their health. A key facilitator of this is the use of wearable sensors that can collect and measure physiological data. Although many sensor-based health monitoring systems have been proposed, interoperability of health data and processes, prediction of future health states, and uncertainty management remain open challenges. This research aims to alleviate these challenges through the development of a reusable framework integrating both data-driven and knowledge-driven AI within a hybrid AI architecture.

#18 Navigating Uncertainty in Epidemic Contexts with Reinforcement Learning [PDF] [Copy] [Kimi]

Author: Elizabeth Akinyi Ondula

My research integrates stochastic epidemic models with reinforcement learning to develop effective strategies or policies to inform operational decisions. The objective is to refine policies that are attuned to diverse outbreak dynamics and to offer a tool for informed planning in real-world settings.
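
To make the coupling of a stochastic epidemic model with a decision policy concrete, the sketch below steps a discrete-time stochastic SIR model where the action scales the contact rate and the reward trades off new infections against the cost of restrictions. The rates, the reward shape, and the action semantics are assumptions for illustration, not the thesis's formulation.

```python
import random

def sir_step(s, i, r, action, beta=0.3, gamma=0.1, rng=random):
    """One day of a discrete-time stochastic SIR model.

    action in [0, 1] scales the contact rate (0 = full lockdown,
    1 = no intervention). Rates and the reward shape are illustrative only.
    """
    n = s + i + r
    p_inf = min(1.0, beta * action * i / n)            # per-susceptible infection prob.
    new_inf = sum(rng.random() < p_inf for _ in range(s))
    new_rec = sum(rng.random() < gamma for _ in range(i))
    s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
    # Reward trades off epidemic burden against the cost of restrictions.
    reward = -new_inf - 0.5 * (1.0 - action) * n / 100
    return (s, i, r), reward

state = (990, 10, 0)
for day in range(5):
    state, rew = sir_step(*state, action=0.6)
    print(day, state, round(rew, 2))
```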

#19 Target Focused Shallow Transformer Framework for Efficient Visual Tracking [PDF] [Copy] [Kimi]

Author: Md Maklachur Rahman

Template-learning transformer trackers have recently achieved significant performance improvements due to long-range dependency learning using the self-attention (SA) mechanism. However, the typical SA mechanisms in transformers adopt a less discriminative design approach, which is inadequate for focusing on the most important target information during tracking. Therefore, existing trackers are easily distracted by background information and have constraints in handling tracking challenges. The focus of our research is to develop a target-focused, discriminative, shallow transformer tracking framework that can learn to distinguish the target from the background and enable accurate tracking at high speed. Extensive experiments will be performed on several popular benchmarks, including OTB100, UAV123, GOT10k, LaSOT, and TrackingNet, to demonstrate the effectiveness of the proposed framework.

#20 Learning Pattern-Based Extractors from Natural Language and Knowledge Graphs: Applying Large Language Models to Wikipedia and Linked Open Data [PDF] [Copy] [Kimi]

Author: Célian Ringwald

Seq-to-seq transformer models have recently been successfully used for relation extraction, showing their flexibility, effectiveness, and scalability on that task. In this context, knowledge graphs aligned with Wikipedia such as DBpedia and Wikidata give us the opportunity to leverage existing texts and corresponding RDF graphs in order to extract, from these texts, the knowledge that is missing in the corresponding graphs and meanwhile improve their coverage. The goal of my thesis is to learn efficient extractors targeting specific RDF patterns and to do so by leveraging the latest language models and the dual base formed by Wikipedia on the one hand, and DBpedia and Wikidata on the other hand.
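
A seq-to-seq extractor of this kind typically emits a linearised sequence that must be parsed back into RDF-style triples. The sketch below shows such a parsing step under an assumed `<s> ... <p> ... <o> ...` linearisation; that format, the example entities, and the DBpedia-style predicates are illustrative assumptions, not the representation used in the thesis.

```python
def parse_triples(generated: str):
    """Parse a linearised seq-to-seq output into (subject, predicate, object)
    triples. The '<s> ... <p> ... <o> ...' linearisation is a hypothetical
    format chosen for illustration."""
    triples = []
    for chunk in generated.split("<s>")[1:]:
        try:
            subj, rest = chunk.split("<p>", 1)
            pred, obj = rest.split("<o>", 1)
            triples.append((subj.strip(), pred.strip(), obj.strip()))
        except ValueError:
            continue   # skip malformed chunks rather than failing
    return triples

output = ("<s> Ada_Lovelace <p> dbo:birthPlace <o> London "
          "<s> Ada_Lovelace <p> dbo:field <o> Mathematics")
print(parse_triples(output))
```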

#21 Learning from an Infant’s Visual Experience [PDF] [Copy] [Kimi]

Author: Deepayan Sanyal

Infants see a selective view of the world: they see some objects with high frequency and from a wide range of viewpoints (e.g., their toys during playing) while a much larger set of objects are seen much more rarely and from limited viewpoints (e.g., objects they see outdoors). Extensive, repeated visual experiences with a small number of objects during infancy play a significant role in the development of human visual skills. Internet-style datasets that are commonly used in computer vision research do not contain the regularities that result from such repeated, structured experiences with a few objects. This has led to a dearth of models that learn by exploiting these regularities. In my PhD dissertation, I use deep learning models to investigate how regularities in an infant's visual experience can be leveraged for visual representation learning.

#22 AI-Assisted Human Teamwork [PDF] [Copy] [Kimi]

Author: Sangwon Seo

Effective teamwork translates to fewer preventable errors and higher task performance in collaborative tasks. However, in time-critical tasks, successful teamwork becomes highly challenging to attain. In such settings, often, team members have partial observability of their surroundings, incur high cost of communication, and have trouble estimating the state and intent of their teammates. To assist a team in improving teamwork at task time, my doctoral research proposes an automated task-time team intervention system. Grounded in the notion of shared mental models, the system first detects whether the team is on the same page or not. It then generates effective interventions to improve teamwork. Additionally, by leveraging past demonstrations to learn a model of team behavior, this system minimizes the need for domain experts to specify teamwork models and rules.

#23 Learning Neuro-Symbolic Abstractions for Robot Planning and Learning [PDF] [Copy] [Kimi]

Author: Naman Shah

Although state-of-the-art hierarchical robot planning algorithms allow robots to efficiently compute long-horizon motion plans for achieving user-desired tasks, these methods typically rely upon environment-dependent state and action abstractions that need to be hand-designed by experts. On the other hand, non-hierarchical robot planning approaches fail to compute solutions for complex tasks that require reasoning over a long horizon. My research addresses these problems by proposing an approach for learning abstractions and developing hierarchical planners that efficiently use learned abstractions to boost robot planning performance and provide strong guarantees of reliability.

#24 The Generalization and Robustness of Transformer-Based Language Models on Commonsense Reasoning [PDF] [Copy] [Kimi]

Author: Ke Shen

The advent of powerful transformer-based discriminative language models and, more recently, generative GPT-family models, has led to notable advancements in natural language processing (NLP). One prominent task is commonsense reasoning, where performance is usually evaluated through multiple-choice question-answering benchmarks. To date, many such benchmarks have been proposed, and 'leaderboards' tracking state-of-the-art performance on those benchmarks suggest that transformer-based models are approaching human-like performance. However, due to documented problems such as hallucination and bias, the research focus is shifting from merely quantifying accuracy on the task to an in-depth, context-sensitive probing of LLMs' generalization and robustness. To gain deeper insight into diagnosing these models' performance in commonsense reasoning scenarios, this thesis comprises three main studies: the generalization ability of transformer-based language models on commonsense reasoning, the trend in confidence distribution of these language models confronted with ambiguous inference tasks, and a proposed risk-centric evaluation framework for both discriminative and generative language models.
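
Studying confidence distributions on multiple-choice benchmarks usually starts from per-choice model scores turned into a probability distribution, from which an uncertainty signal can be derived. The sketch below uses a softmax plus normalised entropy as one such signal; the scores and the particular confidence measure are assumptions for illustration, not the thesis's evaluation protocol.

```python
import math

def choice_confidence(scores):
    """Turn per-choice model scores into a probability distribution and an
    entropy-based confidence signal (1 = fully confident, 0 = uniform)."""
    exps = [math.exp(s - max(scores)) for s in scores]     # stable softmax
    probs = [e / sum(exps) for e in exps]
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    confidence = 1.0 - entropy / math.log(len(scores))
    return probs, confidence

# Hypothetical log-scores for the four options of one benchmark question.
clear_case = [2.3, -0.5, -1.0, -0.7]
ambiguous_case = [0.4, 0.35, 0.3, 0.38]
print(choice_confidence(clear_case)[1])      # high confidence
print(choice_confidence(ambiguous_case)[1])  # low confidence
```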

#25 Does Robin Hood Use a Lightsaber?: Automated Planning for Storytelling [PDF] [Copy] [Kimi]

Author: Nisha Simon

Humans have been using stories to entertain, educate, and persuade audiences for centuries. The advent of modern AI tools in the form of Large Language Models (LLMs) such as ChatGPT continues to fulfill this purpose. However, while recent work has shown that LLMs can successfully be used for narrative generation, they lack coherence and can be prone to repetition and stilted language. Automated Planning can therefore be combined with natural language text generation to create narratives (stories) that are logical, coherent, and believable. A planning model provides scaffolding to an LLM so that the LLM's language generation is context-dependent, allowing users to create more coherent, logical, and believable stories in a variety of domains.